Goal

The main goal of this notebook is to evaluate the usefulness of seqICP when applied on simulations of a simple VAR(1) process.

Sanity check

First, lets run the example for the package documentation so as to check everything is fine.

Let \(e \in \{a,b\}\) be two different environments governed by the following SCM:

\[ x^a_{1,t} = N_a(0.3, 0.09) \\ x^a_{3,t} = x^a_{1,t} + N_a(0.2, 0.04) \\ y^a_{t} = -0.7x^a_{1,t} + 0.6x^a_{3,t} + N_a(0.1, 0.01) \\ x^a_{2,t} = -0.5*y^a_{t} + 0.5*x^a_{3,t} + N_a(0.1, 0.01) \]

\[ x^b_{1,t} = N_b(0.3, 0.09) \\ x^b_{3,t} = N_b(0.5, 0.25) \\ y^b_{t} = -0.7x^b_{1,t} + 0.6x^b_{3,t} + N_b(0.1, 0.01) \\ x^a_{2,t} = -0.5*y^a_{t} + 0.5*x^a_{3,t} + N_a(0.1, 0.01) \]

Thus, a change has occurred on the distribution of \(x^b_{3,t}\) between environments, whereas the SCM of \(y_t\) remained constant. More precisely, we have:

\[ x^a_{3,t} \sim N(x^a_{1,t} + 0.2, 0.04) \\ x^b_{3,t} \sim N_b(0.5, 0.25) \\ \]

and

\[ y^e_{t} = -0.7x^e_{1,t} + 0.6x^e_{3,t} + N_e(0.1, 0.01) \\ \] regardless of \(e\). Lets sample the above process:

set.seed(1)

# environment 1
na <- 140
X1a <- 0.3*rnorm(na)
X3a <- X1a + 0.2*rnorm(na)
Ya <- -.7*X1a + .6*X3a + 0.1*rnorm(na)
X2a <- -0.5*Ya + 0.5*X3a + 0.1*rnorm(na)

# environment 2
nb <- 80
X1b <- 0.3*rnorm(nb)
X3b <- 0.5*rnorm(nb)
Yb <- -.7*X1b + .6*X3b + 0.1*rnorm(nb)
X2b <- -0.5*Yb + 0.5*X3b + 0.1*rnorm(nb)

# combine environments
X1 <- c(X1a,X1b)
X2 <- c(X2a,X2b)
X3 <- c(X3a,X3b)
Y <- c(Ya,Yb)
Xmatrix <- cbind(X1, X2, X3)

svar_sim <- cbind(Y, Xmatrix)

Here are the autocorrelation functions

for (i in 1:dim(svar_sim)[2]){
  print(paste0("x", i))
  ggtsdisplay(svar_sim[, i])
}
## [1] "x1"

## [1] "x2"

## [1] "x3"

## [1] "x4"

and the autocorrelation of the square of the series

for (i in 1:dim(svar_sim)[2]){
  print(paste0("x", i))
  ggtsdisplay(svar_sim[, i]^2)
}
## [1] "x1"

## [1] "x2"

## [1] "x3"

## [1] "x4"

We can note a few interesting observations from the DGP definition and the ACF/PACF.First, from the DGP definition we can conclude that there are only contemporaneous dependencies in the time series. Despite this fact, from a quick inspection of the ACF and PACF’s of the raw time series it seems that the shift on \(x_{3,t}\)’s distribution has introduced some time dependencies on \(y_t\) and itself. For instance, lags two and five for the ACF of \(y_t\) and lag five for the ACF of \(x_{3,t}\) are significant. Third, we can see that shift in \(x_{3,t}\) distribution translates into a shift in \(y_t\)’s variance. From a a visual inspection of the time series of the squares of \(y_t\) and \(x_{3,t}\) we can see that there is a big shift in its variance starting from timestep 130. Furthermore, this shift has introduced arch effects into the time series, which translates into heteroskedasticity of the processes.

From the above observations we can make the following statements about the target \(y_t\):

  1. It is non-stationary
  2. Its DGP is invariant across enviroments
  3. The main shift is in its variance
  4. There are only comtemporaneous dependencies in the DGP

We now run seqICP on the simulations:

seqICP.result <- seqICP(X = Xmatrix,
                        Y = Y,
                        stopIfEmpty=FALSE,
                        silent=FALSE)
## Currently fitting set S = {}
## p-value: 0.02
## Currently fitting set S = {1}
## p-value: 0.02
## Currently fitting set S = {2}
## p-value: 0.02
## Currently fitting set S = {3}
## p-value: 0.02
## Currently fitting set S = {1, 2}
## p-value: 0.02
## Currently fitting set S = {1, 3}
## p-value: 0.91
## Currently fitting set S = {2, 3}
## p-value: 0.02
## Currently fitting set S = {1, 2, 3}
## p-value: 0.22
summary(seqICP.result)
## 
##  Invariant Linear Causal Regression at level 0.05
##  Variables X1, X3 show a significant causal effect
##  
##            coefficient lower bound upper bound  p-value  
## intercept         0.0    -0.05900      0.0179       NA  
## X1               -0.7    -0.75200     -0.5292     0.02 *
## X2                0.0     0.00000      0.0000     0.91  
## X3                0.6     0.57000      0.7228     0.02 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
seqICP.result <- seqICP(X = Xmatrix,
                        Y = Y,
                        model = "ar",
                        stopIfEmpty=FALSE,
                        silent=FALSE)
## Currently fitting set S = {}
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 215 are removed, to account for lag p=5
## p-value: 0.02
## Currently fitting set S = {1}
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 215 are removed, to account for lag p=5
## p-value: 0.02
## Currently fitting set S = {2}
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 215 are removed, to account for lag p=5
## p-value: 0.02
## Currently fitting set S = {3}
## p-value: 0.02
## Currently fitting set S = {1, 2}
## p-value: 0.02
## Currently fitting set S = {1, 3}
## p-value: 0.97
## Currently fitting set S = {2, 3}
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 218 are removed, to account for lag p=2
## p-value: 0.08
## Currently fitting set S = {1, 2, 3}
## p-value: 0.2
summary(seqICP.result)
## 
##  Invariant Linear Causal Regression at level 0.05
##  Variable X3[t] shows a significant causal effect
##  
##            coefficient lower bound upper bound  p-value  
## intercept       -0.01     -0.0710      0.0179       NA  
## X1[t]            0.00      0.0000      0.0000    0.079 .
## X2[t]            0.00      0.0000      0.0000    0.970  
## X3[t]            0.44      0.5700      0.7817    0.020 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Confidence intervals are only approximate due to post selection (pknown is FALSE).

From the above results we can see that seqICP was able to find the correct set of causal parents even if we didn’t know that no lags were present on the original DGP.

Simulated VAR(1) model with \(k=3\) and \(n=100\)

Now we proceed by testing seqICP on a multivariate time series process that has lag dependencies across variables but no clear shift in distribution. Consider the following DGP:

\[ x_{1,t} = 0.2x_{1,t-1} + 0.7x_{3,t-1} + N(0, 1)\\ x_{2,t} = 0.2x_{2,t-1} + N(0, 1)\\\ x_{3,t} = 0.7x_{1,t-1} + 0.2x_{2,t-1} + N(0, 1)\\\ \]

## VAR.sim: simulate VAR as in Enders 2004, p 268
B1 <- matrix(c(0.2, 0, 0.7, 0, 0.2, 0, 0.7, 0.2, 0), nrow = 3, ncol = 3, byrow = TRUE)
var_sim <- VAR.sim(B=B1, n=na+nb, include="none")

Here are the autocorrelation functions

for (i in 1:dim(var_sim)[2]){
  print(paste0("x", i))
  ggtsdisplay(var_sim[, i])
}
## [1] "x1"

## [1] "x2"

## [1] "x3"

and the autocorrelation of the square of the series

for (i in 1:dim(var_sim)[2]){
  print(paste0("x", i))
  ggtsdisplay(var_sim[, i]^2)
}
## [1] "x1"

## [1] "x2"

## [1] "x3"

From a visual inspection of the ACFs and PACFs of the raw and squares of the time series we can notice two main differences. First, all raw time series have a much stronger time dependence which is induced by the lags of the time series. Furthermore, there is no clear shift in mean and/or variance of the series. Therefore we can conclude that the process is jointly stationary.

var_sim_df <- var_sim %>% as.data.table()
colnames(var_sim_df) <- c("x1", "x2", "x3")
varnames <- colnames(var_sim_df)

Ss <- list()
Sdetails <- list()
for (vn in varnames){
  X <- var_sim_df %>% dplyr::select(-one_of(vn))
  y <- var_sim_df %>% dplyr::select(one_of(vn))
  
  output <- seqICP(X=X, Y=y, model = "ar",stopIfEmpty = FALSE, silent = TRUE)
  Sdetails[[vn]] <- output
  
  print(paste0("Target: ", vn, " Features: ", paste(colnames(X), collapse = " ")))
  summary(output)
}
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1

## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1

## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## [1] "Target: x1 Features: x2 x3"
## 
##  Invariant Linear Causal Regression at level 0.05
##  No variable shows a significant causal effect
##  
##            coefficient lower bound upper bound  p-value
## intercept       -0.03     -0.1878      0.0984       NA
## X1[t]            0.00      0.0000      0.0000     0.95
## X2[t]            0.00      0.0000      0.0000     0.93
## Y0[t-1]          0.20     -0.1345      0.2936       NA
## X1[t-1]          0.00     -0.1141      0.2947       NA
## X2[t-1]          0.67     -0.1636      0.7667       NA
## 
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1

## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1

## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1

## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 219 are removed, to account for lag p=1
## [1] "Target: x2 Features: x1 x3"
## 
##  Invariant Linear Causal Regression at level 0.05
##  No variable shows a significant causal effect
##  
##            coefficient lower bound upper bound  p-value
## intercept        0.05     -0.0950       0.205       NA
## X1[t]            0.00      0.0000       0.000        1
## X2[t]            0.00      0.0000       0.000        1
## Y0[t-1]          0.10     -0.1738       0.231       NA
## X1[t-1]          0.05     -0.1732       0.245       NA
## X2[t-1]         -0.07     -0.1886       0.244       NA
## 
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 212 are removed, to account for lag p=8
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 212 are removed, to account for lag p=8
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## [1] "Target: x3 Features: x1 x2"
## 
##  Invariant Linear Causal Regression at level 0.05
##  No variable shows a significant causal effect
##  
##            coefficient lower bound upper bound  p-value
## intercept        0.22      0.0430       0.373       NA
## X1[t]            0.00      0.0000       0.000     0.22
## X2[t]            0.00      0.0000       0.000     0.36
## Y0[t-1]          0.01     -0.1340       0.319       NA
## X1[t-1]          0.65     -0.2700       0.794       NA
## X2[t-1]          0.28     -0.2700       0.796       NA
## Y0[t-2]          0.07     -0.1060       0.758       NA
## X1[t-2]         -0.06     -0.2250       0.404       NA
## X2[t-2]         -0.08     -0.2260       0.220       NA
## Y0[t-3]         -0.08     -0.2540       0.142       NA
## X1[t-3]          0.01     -0.2560       0.177       NA
## X2[t-3]         -0.03     -0.2570       0.177       NA
## Y0[t-4]         -0.15     -0.3200       0.175       NA
## X1[t-4]          0.03     -0.3200       0.201       NA
## X2[t-4]          0.18     -0.2720       0.314       NA
## Y0[t-5]          0.02     -0.1530       0.314       NA
## X1[t-5]          0.02     -0.1530       0.287       NA
## X2[t-5]          0.04     -0.1810       0.191       NA
## Y0[t-6]         -0.09     -0.2600       0.174       NA
## X1[t-6]          0.09     -0.2600       0.261       NA
## X2[t-6]         -0.08     -0.2090       0.261       NA
## Y0[t-7]         -0.09     -0.2540       0.209       NA
## X1[t-7]          0.17     -0.2540       0.344       NA
## X2[t-7]          0.03     -0.1050       0.344       NA
## Y0[t-8]         -0.10     -0.2440       0.160       NA
## X1[t-8]          0.03     -0.2440       0.168       NA
## X2[t-8]          0.10     -0.1040       0.227       NA
## 
## Confidence intervals are only approximate due to post selection (pknown is FALSE).

From the above results we can see that seqICP was not able to find the correct set of causal parents for any of the equations. This result actually makes sense since our original DGP is stationary, and thus violates one of the assumptions of seqICP.

Variance Shifted VAR(1)

To solve for the stationarity issue we had before, we will try to artificially introduce a variance shift into the VAR(1) process we defined before.

To do so, recall that \(e \in \{a,b\}\) are two different environments. We define the following DGP for the variance shifted VAR(1) process:

\[ x^a_{1,t} = 0.2x^a_{1,t-1} + 0.7x^a_{3,t-1} + N(0, 1)\\ x^a_{2,t} = 0.2x^a_{2,t-1} + N(0, 1)\\ x^a_{3,t} = 0.7x^a_{1,t-1} + 0.2x^a_{2,t-1} + N(0, 1)\\ \]

\[ x^b_{1,t} = 0.2x^b_{1,t-1} + 0.7x^b_{3,t-1} + N(0, 1)\\ x^b_{2,t} = 0.2x^b_{2,t-1} + N(0, 1)\\ x^b_{3,t} = 0.7x^b_{1,t-1} + 0.2x^a_{2,t-1} + N(0, 5)\\ \]

Thus, the SCM of \(x_{1,t}\) is the same regardless of \(e\) but there is a variance shift between the environments caused by a change in distribution in \(x_{3,t-1}\).

Below we will sample from the above model:

set.seed(1)
n <- na + nb

x1 <- matrix(data = NA, nrow = n, ncol = 1)
x2 <- matrix(data = NA, nrow = n, ncol = 1)
x3 <- matrix(data = NA, nrow = n, ncol = 1)

for (i in 1:n){
  
  # initialize all series
  if (i == 1){
    x1[i,] <- rnorm(n = 1, mean = 0, sd = 1)
    x2[i,] <- rnorm(n = 1, mean = 0, sd = 1)
    x3[i,] <- rnorm(n = 1, mean = 0, sd = 1)
    next
  }
  
  # sample time series iteratively conditioned on each environment
  if (i <= na){
    x1[i,] <- 0.2*x1[i-1,] + 0.7*x3[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
    x2[i,] <- 0.2*x2[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
    x3[i,] <- 0.7*x1[i-1,] + 0.2*x2[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
  }
  else{
    x1[i,] <- 0.2*x1[i-1,] + 0.7*x3[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
    x2[i,] <- 0.2*x2[i-1,] + rnorm(n = 1, mean = 0, sd = 1)
    x3[i,] <- 0.7*x1[i-1,] + 0.2*x2[i-1,] + rnorm(n = 1, mean = 0, sd = 3)
  }
  
}

var_sim_shifted <- cbind(x1, x2, x3)
ts.plot(var_sim_shifted, type="l", col=c(1, 2, 3))

for (i in 1:dim(var_sim_shifted)[2]){
  print(paste0("x", i))
  ggtsdisplay(var_sim_shifted[, i])
}
## [1] "x1"

## [1] "x2"

## [1] "x3"

and the autocorrelation of the square of the series

for (i in 1:dim(var_sim_shifted)[2]){
  print(paste0("x", i))
  ggtsdisplay(var_sim_shifted[, i]^2)
}
## [1] "x1"

## [1] "x2"

## [1] "x3"

From the above ACF and PACFs we can see that the stationarity property was broken by introducing the shift in variance on \(x_{3,t}\).

Finally, we run seqICP on the new dataset:

var_sim_shifted_df <- var_sim_shifted %>% as.data.table()
colnames(var_sim_shifted_df) <- c("x1", "x2", "x3")
varnames <- colnames(var_sim_shifted_df)

Ss <- list()
Sdetails <- list()
for (vn in varnames){
  X <- var_sim_shifted_df %>% dplyr::select(-one_of(vn))
  y <- var_sim_df %>% dplyr::select(one_of(vn))
  
  output <- seqICP(X=X, Y=y, model = "ar", stopIfEmpty = FALSE, silent = TRUE, )
  Sdetails[[vn]] <- output
  
  print(paste0("Target: ", vn, " Features: ", paste(colnames(X), collapse = " ")))
  summary(output)
}
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 213 are removed, to account for lag p=7
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 213 are removed, to account for lag p=7

## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 213 are removed, to account for lag p=7

## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 213 are removed, to account for lag p=7
## [1] "Target: x1 Features: x2 x3"
## 
##  Invariant Linear Causal Regression at level 0.05
##  No variable shows a significant causal effect
##  
##            coefficient lower bound upper bound  p-value
## intercept        0.10    -0.05900      0.2572       NA
## X1[t]            0.00     0.00000      0.0000        1
## X2[t]            0.00     0.00000      0.0000        1
## Y0[t-1]          0.29    -0.11800      0.4241       NA
## X1[t-1]         -0.04    -0.19600      0.4247       NA
## X2[t-1]          0.03    -0.21300      0.4244       NA
## Y0[t-2]          0.40    -0.21500      0.5382       NA
## X1[t-2]          0.00    -0.16100      0.5440       NA
## X2[t-2]          0.02    -0.16700      0.5443       NA
## Y0[t-3]         -0.05    -0.20300      0.1621       NA
## X1[t-3]         -0.08    -0.24000      0.1006       NA
## X2[t-3]          0.04    -0.24300      0.1105       NA
## Y0[t-4]          0.05    -0.24300      0.2016       NA
## X1[t-4]         -0.08    -0.24300      0.2032       NA
## X2[t-4]         -0.01    -0.24600      0.2018       NA
## Y0[t-5]         -0.06    -0.24700      0.0872       NA
## X1[t-5]          0.14    -0.22100      0.3035       NA
## X2[t-5]         -0.04    -0.22600      0.3071       NA
## Y0[t-6]         -0.10    -0.23600      0.3057       NA
## X1[t-6]         -0.18    -0.34900      0.0491       NA
## X2[t-6]          0.01    -0.34800      0.0738       NA
## Y0[t-7]          0.15    -0.34400      0.2834       NA
## X1[t-7]         -0.11    -0.27200      0.2885       NA
## X2[t-7]          0.04    -0.27600      0.2894       NA
## 
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
## 
## [1] "Target: x2 Features: x1 x3"
## 
##  Invariant Linear Causal Regression at level 0.05
##  No variable shows a significant causal effect
##  
##            coefficient lower bound upper bound  p-value
## intercept        0.04     -0.1060       0.179       NA
## X1[t]            0.00      0.0000       0.000     0.48
## X2[t]            0.00      0.0000       0.000     0.44
## 
## Confidence intervals are only approximate due to post selection (pknown is FALSE).
## Warning in seqICP.s(X, Y, S[[1]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6

## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6

## Warning in seqICP.s(X, Y, S[[ind]], test, par.test, model, par.model): grid
## points larger than 214 are removed, to account for lag p=6
## [1] "Target: x3 Features: x1 x2"
## 
##  Invariant Linear Causal Regression at level 0.05
##  No variable shows a significant causal effect
##  
##            coefficient lower bound upper bound  p-value
## intercept        0.17    -0.00100      0.3401       NA
## X1[t]            0.00     0.00000      0.0000        1
## X2[t]            0.00     0.00000      0.0000        1
## Y0[t-1]          0.11    -0.09800      0.2444       NA
## X1[t-1]          0.07    -0.09800      0.2461       NA
## X2[t-1]          0.04    -0.12300      0.2455       NA
## Y0[t-2]          0.43    -0.14600      0.5670       NA
## X1[t-2]          0.08    -0.14300      0.5772       NA
## X2[t-2]         -0.02    -0.18500      0.5791       NA
## Y0[t-3]          0.02    -0.19100      0.1764       NA
## X1[t-3]         -0.06    -0.19100      0.1778       NA
## X2[t-3]         -0.17    -0.33900      0.1786       NA
## Y0[t-4]          0.00    -0.34100      0.1539       NA
## X1[t-4]         -0.05    -0.34100      0.1524       NA
## X2[t-4]          0.05    -0.15200      0.2208       NA
## Y0[t-5]          0.01    -0.14400      0.2180       NA
## X1[t-5]          0.04    -0.14600      0.2134       NA
## X2[t-5]          0.04    -0.14700      0.2103       NA
## Y0[t-6]         -0.05    -0.19000      0.2082       NA
## X1[t-6]         -0.05    -0.19200      0.2045       NA
## X2[t-6]         -0.17    -0.33300      0.0885       NA
## 
## Confidence intervals are only approximate due to post selection (pknown is FALSE).

As we can see, seqICP was not able to identify the invariant set correctly for the variance shifted VAR(1) process.